sample <- c("Loschbour", "UstIshim", "Saqqaq", "AltaiNeandertal")
coverage <- c(18.2, 35.2, 13.4, 44.8)
archaic <- c(FALSE, FALSE, FALSE, TRUE)
(A few remarks and tips before the practical session)
(We were not supposed to finish everything, so no stress.)
           sample coverage archaic
1       Loschbour     18.2   FALSE
2        UstIshim     35.2   FALSE
3          Saqqaq     13.4   FALSE
4 AltaiNeandertal     44.8    TRUE
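The table above looks like the printed form of a data frame built from three vectors of equal length. The slides do not show the construction step, so the following is a minimal sketch under two assumptions: that base R's `data.frame()` was used, and that the result is called `df` (the name used in the indexing notation).

```r
# Three vectors of equal length, one per future column
sample <- c("Loschbour", "UstIshim", "Saqqaq", "AltaiNeandertal")
coverage <- c(18.2, 35.2, 13.4, 44.8)
archaic <- c(FALSE, FALSE, FALSE, TRUE)

# Assumption: the vectors are combined with data.frame();
# each vector becomes a column named after the variable
df <- data.frame(sample, coverage, archaic)

df        # print the table
str(df)   # show the type of each column
dim(df)   # number of rows and columns
```

Each row describes one individual and each column holds one type of value, which is why the columns can be character, numeric and logical at the same time.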
df[rows, cols]
Indexing by columns (“selecting columns”)
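To select columns, leave the `rows` slot empty and put column names (or positions) in the `cols` slot. A small sketch in base R, assuming a data frame `df` with the `sample`, `coverage` and `archaic` columns; the result names (`two_cols` and so on) are only illustrative.

```r
df <- data.frame(
  sample = c("Loschbour", "UstIshim", "Saqqaq", "AltaiNeandertal"),
  coverage = c(18.2, 35.2, 13.4, 44.8),
  archaic = c(FALSE, FALSE, FALSE, TRUE)
)

# Empty rows slot = keep all rows; name the columns to keep
two_cols <- df[, c("sample", "coverage")]

# Columns can also be selected by position
same_cols <- df[, c(1, 2)]

# Selecting a single column returns a plain vector, not a data frame
cov_vec <- df[, "coverage"]   # same values as df$coverage

# drop = FALSE keeps the result as a one-column data frame
cov_df <- df[, "coverage", drop = FALSE]
```

The difference between `cov_vec` and `cov_df` is a common source of confusion: the values are identical, but only the second one is still a data frame.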
df[rows, cols]
Indexing by rows (“filtering rows”)
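To filter rows, put row positions or a logical condition in the `rows` slot and leave the `cols` slot empty. A small sketch in base R, assuming a data frame `df` with the `sample`, `coverage` and `archaic` columns; the threshold of 30 and the result names are chosen only for illustration.

```r
df <- data.frame(
  sample = c("Loschbour", "UstIshim", "Saqqaq", "AltaiNeandertal"),
  coverage = c(18.2, 35.2, 13.4, 44.8),
  archaic = c(FALSE, FALSE, FALSE, TRUE)
)

# Empty cols slot = keep all columns
first_two <- df[1:2, ]                # rows by position
modern <- df[!df$archaic, ]           # rows where archaic is FALSE
high_cov <- df[df$coverage > 30, ]    # rows where coverage exceeds 30

# Both slots at once: names of the samples with coverage above 30
high_cov_names <- df[df$coverage > 30, "sample"]
```

The condition `df$coverage > 30` is itself a logical vector with one value per row, and only the rows where it is `TRUE` are kept.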
The tidyverse is a language for solving data science challenges with R code. Its primary goal is to facilitate a conversation between a human and a computer about data. Less abstractly, the tidyverse is a collection of R packages that share a high-level design philosophy […] so that learning one package makes it easier to learn the next.
The tidyverse encompasses the repeated tasks at the heart of every data science project: data import, tidying, manipulation, visualisation, and programming.
“Western Eurasia witnessed several large-scale human migrations during the Holocene. Here, to investigate the cross-continental effects of these migrations, we shotgun-sequenced 317 genomes—mainly from the Mesolithic and Neolithic periods—from across northern and western Eurasia. These were imputed alongside published data to obtain diploid genotypes from more than 1,600 ancient humans [and about 2,500 present-day humans].”
Our exercises will focus on two MesoNeo data sets:
A great example of how to approach totally unfamiliar data!